If you don’t already have these packages installed, run this code:
install.packages("plotly")
install.packages("ggplot2")
install.packages("dplyr")
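If some of these packages are already installed, one option is to check first and install only what is missing. The sketch below assumes the same three packages; the variable names are illustrative.

```r
# Packages this activity relies on
required_pkgs <- c("plotly", "ggplot2", "dplyr")

# requireNamespace() returns FALSE (instead of raising an error) when a package is absent
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

# Install only the packages that were not found
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Running this a second time should do nothing, because every package is then found.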
Run this code to load the packages required for this activity:
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Introduction
Data visualization is an important tool in data science. It helps people share the findings of their research or analysis with audiences from many different backgrounds through effective visuals. At Macalester, the data visualization package we are most commonly taught in R is ggplot2. This package takes a data frame and turns it into a graphic, with the user specifying aesthetic parameters to create the desired visualization.
For this activity, we will build on our knowledge of data visualization and learn a new tool called plotly, “an Interactive web-based data visualization” package that can be used in R and Python. This package takes in data and turns it into an interactive visualization that can enhance the message of the analysis. By completing this activity and its reflection points, you will hopefully learn a new skill that you can add to your bag of tricks.
Before beginning this activity, please look through this reference, which covers the basics and syntax of plotting with plotly. Throughout the activity, if anything is unclear, look back at it for code help: https://plotly-r.com/
Section 1: Basics
For this activity, we will be using the mtcars data set that is built into R.
glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
## $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
## $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
## $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
## $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
## $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…
As mentioned above, we have commonly been taught to use ggplot to make visualizations. It is an effective package that gives users a vast number of options. Included below is a scatter plot using ggplot and the mtcars dataset.
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
geom_point()+
labs(
x = "Weight (X thousand lbs)", y = "Miles per gallon",
title = "Fuel efficiency by weight", color = "Cylinders") +
theme_minimal()
Reflection
What is each part of the ggplot call doing to build the visual?
What could be improved in this graph to make it more informative?
Answer here:
Section 1, part b
Just like ggplot, Plotly allows us to create visualizations, but with slightly different formatting. Below are several examples of common graph types—this time using Plotly.
As you run each cell, interact with the plots:
Try:
Zooming in (double-click to zoom back out)
Hovering over points/bars
Thinking about what information becomes easier to see interactively
plot_ly(
data = mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers") %>%
layout(title = "Scatter Plot: Weight vs. MPG", xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"))
plot_ly(
data = mtcars, x = ~factor(cyl), type = "histogram") %>%
layout( title = "Histogram of Cylinder Count", xaxis = list(title = "Number of Cylinders"), yaxis = list(title = "Count of Cars"))
mtcars_sorted <- mtcars %>% arrange(hp)
plot_ly(
data = mtcars_sorted, x = ~hp, y = ~mpg, type = "scatter", mode = "lines") %>%
layout(title = "Line Plot: MPG Across Increasing Horsepower", xaxis = list(title = "Horsepower"), yaxis = list(title = "Miles per Gallon"))
plot_ly(
data = mtcars, x = ~wt, y = ~mpg, z = ~hp, color = ~factor(cyl), type = "scatter3d", mode = "markers") %>%
layout(
title = "3D Scatter Plot: Weight, MPG, and Horsepower", scene = list( xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"), zaxis = list(title = "Horsepower")),
legend = list(title = list(text = "Cylinders")))
Reflection
How does interactivity (hover, zoom, filtering) change the way you understand the information in these graphs compared to static ggplot visuals?
In what types of data scenarios do you think Plotly would add meaningful value, and when might a static ggplot be more appropriate?
Answer here:
Section 2: Making the graph interactive
Returning to the original ggplot scatterplot, there are two ways to convert it into an interactive Plotly visualization.
Approach 1. Keep the ggplot code and let plotly handle the conversion. The plot is built with the same ggplot calls as before, and passing the result to ggplotly() turns it into an interactive graphic. If you hover over a point, you can see its x and y values.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
geom_point() +
labs(
x = "Weight (X thousand lbs)",
y = "Miles per gallon",
title = "Fuel efficiency by weight",
color = "Cylinders"
) +
theme_minimal()
# Convert the ggplot object into an interactive Plotly graphic
ggplotly(p)
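ggplotly() also accepts a tooltip argument that controls which aesthetic mappings appear in the hover label. The sketch below rebuilds the same scatter plot and limits the hover text to the x and y values; the object name interactive_p is only illustrative.

```r
library(ggplot2)
library(plotly)

# Same static scatter plot as above
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)",
    y = "Miles per gallon",
    title = "Fuel efficiency by weight",
    color = "Cylinders"
  ) +
  theme_minimal()

# tooltip restricts the hover label to the listed aesthetics (here: x and y only)
interactive_p <- ggplotly(p, tooltip = c("x", "y"))
interactive_p
```

Leaving tooltip at its default shows every mapped aesthetic, which is usually fine for a first look.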
Approach 2. This approach uses Plotly syntax from the start, giving more control over features like hover labels, legends, colors, and marker styling.
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
color = ~factor(cyl),
colors = "Set1",
type = "scatter",
mode = "markers",
marker = list(size = 10)
) %>%
layout(
title = "Fuel efficiency by weight",
xaxis = list(title = "Weight (X thousand lbs)"),
yaxis = list(title = "Miles per gallon"),
legend = list(title = list(text = "Cylinders"))
)
This activity only introduces the basics of Plotly, but the package offers many additional tools and visualization types. We encourage you to explore further and try out different interactive features and plot options.
Key Takeaways:
Plotly uses concepts similar to ggplot's, but with different syntax
Basic plots can be created with a few lines of code (it's pretty easy to make them interactive!)
Interactivity sets it apart from ggplot (hovering, zooming, and rotating help make patterns clearer)
Plotly is ideal when you want users to explore and learn a bit on their own
So far, we’ve used Plotly to make basic interactive plots where you can zoom and hover. But the default hover labels and colors aren’t always the most helpful.
Plotly lets you:
Customize hover labels: decide exactly which variables appear when you hover over a point, and how they’re formatted (e.g., “Weight: 2.5, MPG: 22”).
Control color: map color to a variable (like number of cylinders) and choose a color palette that makes groups easy to compare.
Style markers: change size, opacity, and sometimes symbol, which helps highlight important points or avoid overplotting.
This combination makes your graph feel more like a little “data app” — each hover tells a small story about that specific car.
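Here’s a customized scatterplot using mtcars. We’ll:
Color the points by cyl (number of cylinders)
Enlarge the markers and make them slightly transparent
Show the model name, weight, MPG, horsepower, and gears in the hover label
plot_ly(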
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
colors = "Dark2",
marker = list(size = 12, opacity = 0.8),
text = ~paste(
"Model:", rownames(mtcars),
"<br>Weight:", wt,
"<br>MPG:", mpg,
"<br>Horsepower:", hp,
"<br>Gears:", gear
),
hoverinfo = "text"
) %>%
layout(
title = "Fuel Efficiency with Custom Hover Labels",
xaxis = list(title = "Weight (thousand lbs)"),
yaxis = list(title = "Miles per Gallon"),
legend = list(title = list(text = "Cylinders"))
)
When you hover over points now, you don’t just see numbers — you get a mini “profile” of each car.
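If you also want to control number formatting in the hover label, Plotly's hovertemplate attribute is one alternative to building the text with paste(). This is a sketch under a few assumptions: the data frame copy cars, its model column, and the chosen number of decimals are illustrative and not part of the original example.

```r
library(plotly)

# Copy the data and store the model names as a regular column
cars <- mtcars
cars$model <- rownames(mtcars)

p_hover <- plot_ly(
  data = cars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Dark2",
  text = ~model,
  # %{x:.2f} and %{y:.1f} format the axis values; <extra></extra> hides the trace-name box
  hovertemplate = "Model: %{text}<br>Weight: %{x:.2f}<br>MPG: %{y:.1f}<extra></extra>"
)
p_hover
```

The paste() approach is easier to read when you are combining many columns, while hovertemplate is handy when you care about rounding.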
Exercise
Goal: Practice customizing hover labels and styling so the plot tells a clearer story.
Start from the example above (you can copy/paste it).
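Make the following three changes:
Add at least one more variable to the hover text (e.g., qsec or carb).
Change the variable or palette used for color.
Adjust the marker size, opacity, or symbol.
Use this chunk as your starting point: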
# Exercise: Customize hover labels and styling
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
# TODO: choose your color mapping and palette
color = ~factor(cyl),
colors = "Set1",
marker = list(
size = 10, # you can change this
opacity = 0.9 # and this
),
text = ~paste(
# TODO: customize your hover text
"Model:", rownames(mtcars),
"<br>Weight:", wt,
"<br>MPG:", mpg
),
hoverinfo = "text"
) %>%
layout(
title = "Your Customized Interactive Plot",
xaxis = list(title = "Weight (thousand lbs)"),
yaxis = list(title = "Miles per Gallon")
)
Short reflection (answer in text under the chunk):
Interactivity isn’t just about pretty hover labels — it’s also about controlling which data you see and how closely you look at it.
There are two main ideas here:
Filtering the data before plotting: Using dplyr::filter(), you can focus on a subset of the data (e.g., only cars with high MPG, or only 4-cylinder cars). This makes your interactive plot more targeted.
Zooming and panning inside Plotly: Once the plot is rendered, you can click and drag to zoom into a region, pan across the plot, and double-click to reset the view.
Combining filtering + zooming lets you move between “big picture” and “close-up” views of your dataset.
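Before plotting a filtered subset, it can help to check how many rows each group contains, so you know how many points to expect. A minimal sketch using dplyr; the object names cyl_counts and n_4cyl are illustrative.

```r
library(dplyr)

# How many cars fall into each cylinder group?
cyl_counts <- mtcars %>% count(cyl)
cyl_counts

# Rows remaining after keeping only 4-cylinder cars
n_4cyl <- mtcars %>%
  filter(cyl == 4) %>%
  nrow()
n_4cyl  # 11
```

A very small count is a hint that the filtered plot will be sparse and that color groupings may collapse to one or two levels.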
Example A: Filter to only 4-cylinder cars
mtcars_4cyl <- mtcars %>%
filter(cyl == 4)
plot_ly(
data = mtcars_4cyl,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
marker = list(size = 10),
color = ~factor(gear)
) %>%
layout(
title = "Filtered Plot: Only 4-Cylinder Cars",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon"),
legend = list(title = list(text = "Gears"))
)
Example B: Zooming on an unfiltered plot
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl)
) %>%
layout(
title = "Try Zooming and Panning (Drag to zoom, double-click to reset)",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon")
)
Try: click-and-drag to zoom into a cluster of points, then double-click to reset.
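Zooming can also be set in code by giving each axis a starting range in layout(). The sketch below opens the plot already zoomed in on the lighter cars; the range values and the object name p_zoom are arbitrary choices for illustration.

```r
library(plotly)

p_zoom <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    title = "Starting View: Lighter Cars",
    # range sets the initial axis limits; points outside are hidden, not removed
    xaxis = list(title = "Weight", range = c(1.5, 3.5)),
    yaxis = list(title = "Miles per Gallon", range = c(15, 35))
  )
p_zoom
```

Unlike filtering, this keeps all 32 cars in the plot, so the viewer can still pan or zoom out to see the rest.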
Exercise
Goal: Use both filtering and zooming to explore a subset of cars more deeply.
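Use dplyr::filter() to pick one subset of interest, such as:
High-MPG cars (mpg > 25)
High-horsepower cars (hp > 150)
Six-cylinder cars (cyl == 6)
Manual-transmission cars (am == 1)
Make an interactive scatterplot of wt vs mpg for that subset.
Interact with it: hover over points, click and drag to zoom into a cluster, then double-click to reset.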
Write 1–2 sentences about something you saw that you might not have noticed in the full dataset.
Scaffolded code:
# Exercise: Filter + zoom exploration
# 1. Filter the dataset (change this line to your own filter condition)
mtcars_subset <- mtcars %>%
filter(mpg > 25) # <-- edit this condition
# 2. Make an interactive scatterplot
plot_ly(
data = mtcars_subset,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
marker = list(size = 10)
) %>%
layout(
title = "Filtered & Zoomable Plot",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon")
)
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
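The warning above appears to come from Plotly's default color palette: RColorBrewer palettes need at least three levels, and the mpg > 25 subset has fewer cylinder groups than that. The plot still renders. One possible way to avoid the warning, offered as an assumption and not the only fix, is to pass explicit colors through the colors argument. The hex values and the name cyl_colors below are illustrative.

```r
library(plotly)
library(dplyr)

mtcars_subset <- mtcars %>%
  filter(mpg > 25)

# One named hex color per cylinder level (the color values are arbitrary choices)
cyl_colors <- c("4" = "#1b9e77", "6" = "#d95f02", "8" = "#7570b3")

p_subset <- plot_ly(
  data = mtcars_subset,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = cyl_colors,  # explicit colors, so the default palette lookup should be skipped
  marker = list(size = 10)
)
p_subset
```

Naming the colors by level also keeps each cylinder group the same color across differently filtered plots.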
Reflection (answer in text under the chunk):
AI usage: AI assistance was used to improve grammar, clarity, and spelling.